Cross-Lingual Transfer of Natural Language Processing Systems
Accurate natural language processing systems rely heavily on annotated datasets. In the absence of such datasets, transfer methods can help to develop a model by transferring annotations from one or more rich-resource languages to the target language of interest. These methods are generally divided into two approaches: 1) annotation projection from translation data, also known as parallel data, using supervised models in rich-resource languages, and 2) direct model transfer from annotated datasets in rich-resource languages.
In this thesis, we demonstrate different methods for transferring dependency parsers and sentiment analysis systems. We propose an annotation projection method that performs well in scenarios where a large amount of in-domain parallel data is available. We also propose a method that combines annotation projection and direct transfer, leveraging a minimal amount of information from a small out-of-domain parallel dataset to develop highly accurate transfer models. Furthermore, we propose an unsupervised syntactic reordering model to improve the accuracy of dependency parser transfer for non-European languages. Finally, we conduct a diverse set of experiments on the transfer of sentiment analysis systems in different data settings.
A summary of our contributions is as follows:
* We develop accurate dependency parsers using parallel text in an annotation projection framework. We make use of the fact that the density of word alignments is a valuable indicator of reliability in annotation projection (see the first sketch after this list).
* We develop accurate dependency parsers in the absence of a large amount of parallel data. We use the Bible, which is orders of magnitude smaller than a conventional parallel dataset, to provide minimal cues for creating cross-lingual word representations. Our model is also capable of boosting the performance of annotation projection when a large amount of parallel data is available. Our model builds cross-lingual word representations to go beyond traditional delexicalized direct transfer methods. Moreover, we propose a simple but effective word translation approach that brings explicit lexical features from the target language into our direct transfer method.
* We develop different syntactic reordering models that transform the source treebanks of rich-resource languages, thus preventing the parser from learning a wrong model for an unrelated target language (see the second sketch after this list). Our experimental results show substantial improvements on non-European languages.
* We develop transfer methods for sentiment analysis in different data availability scenarios. We show that we can leverage cross-lingual word embeddings to create accurate sentiment analysis systems in the absence of annotated data in the target language of interest.
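As a minimal illustration of the alignment-density heuristic from the first contribution, the following sketch filters projected sentences by the fraction of target tokens that received a head through the alignments. The function names and the 0.8 threshold are illustrative assumptions, not the thesis's actual implementation.

```python
from typing import Optional, Sequence

def projection_density(projected_heads: Sequence[Optional[int]]) -> float:
    """Fraction of target tokens whose dependency head was successfully
    projected through the word alignments (None = no aligned head)."""
    if not projected_heads:
        return 0.0
    covered = sum(1 for h in projected_heads if h is not None)
    return covered / len(projected_heads)

def keep_for_training(projected_heads: Sequence[Optional[int]],
                      threshold: float = 0.8) -> bool:
    """Keep only densely projected sentences as (noisy) training trees;
    the threshold is an illustrative assumption."""
    return projection_density(projected_heads) >= threshold

# Example: 4 of 5 target tokens received a projected head (density 0.8).
assert keep_for_training([2, 0, None, 2, 4])
```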
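The reordering idea from the third contribution can be sketched as a recursive linearization that places each dependent subtree before or after its head according to the target language's preferred direction for that relation. Everything below (relation names, the preference function, the toy sentence) is an illustrative assumption; the thesis's actual model estimates these preferences without supervision.

```python
def linearize(head, children, rels, tokens, prefers_before):
    """Emit the subtree rooted at `head`, placing each dependent subtree
    before or after the head per the target-language direction preference."""
    out = []
    for c in children[head]:
        if prefers_before(rels[c]):
            out += linearize(c, children, rels, tokens, prefers_before)
    out.append(tokens[head])
    for c in children[head]:
        if not prefers_before(rels[c]):
            out += linearize(c, children, rels, tokens, prefers_before)
    return out

# Toy example: reorder an English SVO clause toward a verb-final order
# by preferring both subjects and objects before the head.
tokens = ["She", "reads", "books"]
heads = [1, -1, 1]                   # token index of each head; -1 = root
rels = ["nsubj", "root", "obj"]
children = {i: [d for d, h in enumerate(heads) if h == i]
            for i in range(len(tokens))}
root = heads.index(-1)
print(linearize(root, children, rels, tokens,
                lambda r: r in {"nsubj", "obj"}))
# -> ['She', 'books', 'reads']
```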
We believe that the novelties we introduce in this thesis demonstrate the usefulness of transfer methods. This is appealing in practice, especially since we suggest eliminating the requirement of annotating new datasets for low-resource languages, which are expensive, if not impossible, to obtain.
Multilingual Bidirectional Unsupervised Translation Through Multilingual Finetuning and Back-Translation
We propose a two-stage approach for training a single NMT model to translate unseen languages both to and from English. In the first stage, we initialize an encoder-decoder model with pretrained XLM-R and RoBERTa weights, then perform multilingual fine-tuning on parallel data from 40 languages into English. We find this model can generalize to zero-shot translation of unseen languages. In the second stage, we leverage this generalization ability to generate synthetic parallel data from monolingual datasets, then train with successive rounds of bidirectional back-translation.
We term our approach EcXTra (English-centric Crosslingual (X) Transfer). Our approach is conceptually simple, using only a standard cross-entropy objective throughout, and is also data-driven, sequentially leveraging auxiliary parallel data and monolingual data. We evaluate our unsupervised NMT results on 7 low-resource languages and find that each round of back-translation training further refines bidirectional performance. Our final single EcXTra-trained model achieves competitive translation performance in all translation directions, notably establishing a new state of the art for English-to-Kazakh (22.9 vs. the previous 10.4 BLEU).
Comment: LoResMT @ EACL 2023
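As a rough sketch of one round of the bidirectional back-translation described above: monolingual text on each side is translated with the current model, and the resulting synthetic-source/real-target pairs are used for further cross-entropy training. `translate` and `train_step` are stand-in callables, not the paper's actual training code.

```python
def back_translation_round(translate, train_step, mono_en, mono_xx):
    """One round of bidirectional back-translation: synthetic sources are
    paired with real targets, refining both translation directions."""
    for en in mono_en:                      # real English sentences
        xx_synth = translate(en, tgt="xx")  # back-translate into language xx
        train_step(src=xx_synth, tgt=en)    # refines the xx -> en direction
    for xx in mono_xx:                      # real xx sentences
        en_synth = translate(xx, tgt="en")
        train_step(src=en_synth, tgt=xx)    # refines the en -> xx direction

# Toy usage with stand-ins, just to show the data flow of one round;
# successive rounds would regenerate synthetic data with the improved model.
back_translation_round(
    translate=lambda s, tgt: f"<{tgt}> {s}",
    train_step=lambda src, tgt: None,
    mono_en=["hello world"],
    mono_xx=["salem alem"],
)
```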
External Language Model Integration for Factorized Neural Transducers
We propose an adaptation method for factorized neural transducers (FNT) with external language models. We demonstrate that both neural and n-gram external LMs add significantly more value when linearly interpolated with the predictor output than with shallow fusion, confirming that FNT forces the predictor to act like a regular language model. Further, we propose a method to integrate class-based n-gram language models into the FNT framework, resulting in accuracy gains similar to a hybrid setup. We show average gains of 18% WERR with lexical adaptation across various scenarios, and additive gains of up to 60% WERR in one entity-rich scenario through a combination of class-based n-gram and neural LMs.
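To make the contrast concrete, here is a minimal sketch of the two integration schemes named above: shallow fusion adds weighted external-LM log-probabilities to the full model score, while linear interpolation mixes the external LM with the FNT predictor's output distribution in probability space. The interpolation weight and the toy distributions are illustrative assumptions, not values from the paper.

```python
import numpy as np

def shallow_fusion(log_p_model, log_p_lm, lam=0.3):
    """Shallow fusion: add weighted external-LM log-probs to model scores."""
    return log_p_model + lam * log_p_lm

def predictor_interpolation(log_p_pred, log_p_lm, lam=0.3):
    """Linear interpolation between the FNT predictor output and the
    external LM in probability space, returned in log space."""
    p = (1.0 - lam) * np.exp(log_p_pred) + lam * np.exp(log_p_lm)
    return np.log(p)

# Toy vocabulary of 3 tokens: the external LM shifts mass toward token 2.
log_p_pred = np.log([0.7, 0.2, 0.1])
log_p_lm = np.log([0.1, 0.2, 0.7])
print(predictor_interpolation(log_p_pred, log_p_lm))
```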